Inverse kinematics (IK) is the problem of finding robot joint configurations that satisfy constraints on the position or pose of one or more end-effectors. For robots with redundant degrees of freedom, there is often an infinite, nonconvex set of solutions. The IK problem is further complicated when collision avoidance constraints are imposed by obstacles in the workspace. In general, closed-form expressions yielding feasible configurations do not exist, motivating the use of numerical solution methods. However, these methods rely on local optimization of nonconvex problems, typically requiring an accurate initialization or numerous re-initializations to converge to a valid solution. In this work, we first formulate inverse kinematics with complex workspace constraints as a convex feasibility problem whose low-rank feasible points provide exact IK solutions. We then present \texttt{CIDGIK} (Convex Iteration for Distance-Geometric Inverse Kinematics), an algorithm that solves this feasibility problem with a sequence of semidefinite programs whose objectives are designed to encourage low-rank minimizers. Our problem formulation elegantly unifies the configuration space and workspace constraints of a robot: intrinsic robot geometry and obstacle avoidance are both expressed as simple linear matrix equations and inequalities. Our experimental results for a variety of popular manipulator models demonstrate faster and more accurate convergence than a conventional nonlinear optimization-based approach, especially in environments with many obstacles.
Structure from Motion (SfM) has recently been posed as a self-supervised learning problem in which neural network models of depth and egomotion are learned jointly through view synthesis. In this paper, we address the open problem of how best to couple, or link, the depth and egomotion network components so that information such as a common scale factor can be shared between the networks. To this end, we introduce several notions of coupling, categorize existing approaches, and present a novel tightly-coupled approach that leverages the interdependence of depth and egomotion at training time and at test time. Our approach uses iterative view synthesis to recursively update the egomotion network input, allowing contextual information to be passed between the components. We demonstrate through substantial experiments that our approach promotes consistency between the depth and egomotion predictions at test time, improves generalization, and leads to state-of-the-art accuracy on indoor and outdoor depth and egomotion evaluation benchmarks.
We present a novel method of fusing the power of deep networks with the computational efficiency of geometric and probabilistic localization algorithms. In contrast to other approaches that completely replace a classical visual estimator with a deep network, we propose a method that uses a convolutional neural network to learn difficult-to-model corrections to the estimator from ground-truth training data. To this end, we derive a novel loss function for learning SE(3) corrections based on a matrix Lie group approach, with a natural formulation for balancing translation and rotation errors. We use this loss to train a Deep Pose Correction network (DPC-Net) that predicts corrections for a particular estimator, sensor, and environment. Using the KITTI odometry dataset, we demonstrate significant improvements to the accuracy of a computationally-efficient sparse stereo visual odometry pipeline, making it as accurate as a modern computationally-intensive dense estimator. Further, we show how DPC-Net can be used to mitigate the effects of poorly calibrated lens distortion parameters.
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
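The patch-based workaround that the survey found most common can be illustrated with a minimal sketch; the function name, patch size, and stride below are illustrative and not drawn from any surveyed solution:

```python
import numpy as np

def extract_patches(image, patch_size, stride):
    """Slide a window over a 2D image and collect fixed-size patches.

    Patch-based training feeds these smaller crops to the network
    instead of the full image, sidestepping memory limits.
    """
    h, w = image.shape
    ph, pw = patch_size
    patches = []
    for top in range(0, h - ph + 1, stride):
        for left in range(0, w - pw + 1, stride):
            patches.append(image[top:top + ph, left:left + pw])
    return np.stack(patches)

image = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
patches = extract_patches(image, patch_size=(16, 16), stride=16)
# 4 window positions per axis -> 16 non-overlapping 16x16 patches
```

With `stride` smaller than the patch size, the same routine yields overlapping patches, a common choice when patch predictions are later blended back into a full-resolution output.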
We propose a fully unsupervised method to detect bias in contextualized embeddings. The method leverages the assortative information latently encoded by social networks and combines orthogonality regularization, structured sparsity learning, and graph neural networks to find the embedding subspace capturing this information. As a concrete example, we focus on the phenomenon of ideological bias: we introduce the concept of an ideological subspace, show how it can be found by applying our method to online discussion forums, and present techniques to probe it. Our experiments suggest that the ideological subspace encodes abstract evaluative semantics and reflects changes in the political left-right spectrum during the presidency of Donald Trump.
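As a minimal sketch of the orthogonality-regularization ingredient only (the matrix `W` stands in for a learned subspace projection; the structured sparsity and graph neural network components are omitted):

```python
import numpy as np

def orthogonality_penalty(W):
    """Soft orthogonality regularizer ||W^T W - I||_F^2.

    Driving this toward zero during training encourages the columns of W
    (a candidate basis for the bias subspace) to be orthonormal.
    """
    k = W.shape[1]
    gram = W.T @ W
    return float(np.sum((gram - np.eye(k)) ** 2))

# An orthonormal basis incurs (numerically) zero penalty...
Q, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((8, 3)))
zero_penalty = orthogonality_penalty(Q)
# ...while correlated columns are penalized.
correlated_penalty = orthogonality_penalty(np.ones((4, 2)))
```

In practice this term would be added, with a tunable weight, to the task loss of the embedding model.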
Classifying forecasting methods as being either of a "machine learning" or "statistical" nature has become commonplace in parts of the forecasting literature and community, as exemplified by the M4 competition and the conclusions drawn by its organizers. We argue that this distinction does not stem from fundamental differences in the methods assigned to either class. Instead, the distinction is probably of a tribal nature, which limits the insights it offers into the appropriateness and effectiveness of different forecasting methods. We provide alternative characteristics of forecasting methods which, in our view, allow meaningful conclusions to be drawn. Further, we discuss areas of forecasting which could benefit most from cross-pollination between the ML and statistics communities.
Incorporating prior knowledge of physics laws and structural properties of dynamical systems into the design of deep learning architectures has proven to be a powerful technique for improving their computational efficiency and generalization capacity. Learning accurate models of robot dynamics is critical for safe and stable control. Autonomous mobile robots, including wheeled, aerial, and underwater vehicles, can be modeled as controlled Lagrangian or Hamiltonian rigid-body systems evolving on matrix Lie groups. In this paper, we introduce a new structure-preserving deep learning architecture, the Lie group Forced Variational Integrator Network (LieFVIN), capable of learning controlled Lagrangian or Hamiltonian dynamics on Lie groups, either from position-velocity or position-only data. By design, LieFVINs preserve both the Lie group structure on which the dynamics evolve and the symplectic structure underlying the Hamiltonian or Lagrangian systems of interest. The proposed architecture learns surrogate discrete-time flow maps instead of surrogate vector fields, which allows better and faster prediction without requiring the use of a numerical integrator, neural ODE, or adjoint techniques. Furthermore, the learnt discrete-time dynamics can be combined seamlessly with computationally scalable discrete-time (optimal) control strategies.
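As a toy illustration of why multiplicative Lie-group updates preserve structure, the sketch below advances a rotation on SO(3) by group multiplication. Note this is a plain geometric Euler step with a fixed angular velocity, not the paper's learned variational integrator:

```python
import numpy as np

def hat(w):
    """Map an angular-velocity vector to its skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def expm_so3(w):
    """Rodrigues' formula: exponential map from so(3) to SO(3)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    K = hat(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def step(R, w, dt):
    """One discrete-time step: right-multiply by a group element.

    Because the update is a product of rotation matrices, R stays on
    SO(3) exactly, with no drift and no reprojection step needed.
    """
    return R @ expm_so3(w * dt)

R = np.eye(3)
for _ in range(1000):
    R = step(R, np.array([0.3, -0.2, 0.5]), dt=0.01)
drift = np.linalg.norm(R.T @ R - np.eye(3))  # orthogonality error after 1000 steps
```

A vector-space integrator applied to the nine matrix entries would instead accumulate orthogonality error at every step; learning a discrete flow map composed of group elements, as LieFVIN does, bakes this invariance into the architecture.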
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Denoising diffusions are state-of-the-art generative models which exhibit remarkable empirical performance and come with theoretical guarantees. The core idea of these models is to progressively transform the empirical data distribution into a simple Gaussian distribution by adding noise using a diffusion. We obtain new samples whose distribution is close to the data distribution by simulating a "denoising" diffusion approximating the time reversal of this "noising" diffusion. This denoising diffusion relies on approximations of the logarithmic derivatives of the noised data densities, known as scores, obtained using score matching. Such models can be easily extended to perform approximate posterior simulation in high-dimensional scenarios where one can only sample from the prior and simulate synthetic observations from the likelihood. These methods have been primarily developed for data on $\mathbb{R}^d$ while extensions to more general spaces have been developed on a case-by-case basis. We propose here a general framework which not only unifies and generalizes this approach to a wide class of spaces but also leads to an original extension of score matching. We illustrate the resulting class of denoising Markov models on various applications.
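A toy sketch on $\mathbb{R}$ of the score-driven idea: here the target is a Gaussian whose score is known in closed form, so unadjusted Langevin updates stand in for both the learned score network and the full time-reversed diffusion:

```python
import numpy as np

rng = np.random.default_rng(0)

def score(x, mu=2.0, sigma=1.0):
    """Analytic score (gradient of the log density) of the target N(mu, sigma^2).

    In a real denoising diffusion model, this is the quantity that
    score matching estimates from data at each noise level.
    """
    return (mu - x) / sigma**2

# Unadjusted Langevin dynamics: particles drift up the log-density
# gradient while injected noise keeps them spread over the target.
x = rng.standard_normal(20000) * 5.0   # initialize far from the target
eps = 0.1                              # step size
for _ in range(500):
    x = x + 0.5 * eps * score(x) + np.sqrt(eps) * rng.standard_normal(x.shape)
# x now approximates N(2, 1), up to O(eps) discretization bias
```

The generalization discussed in the abstract replaces this Euclidean update with denoising Markov kernels adapted to the geometry of the space in question.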
The heterogeneity of hardware and data is a well-known and well-studied problem in the Federated Learning (FL) community. Recently, custom-size client models trained with Knowledge Distillation (KD) have emerged as a viable strategy for tackling the heterogeneity challenge. However, previous efforts in this direction have been aimed at client model tuning rather than at its impact on the knowledge aggregation of the global model. Although the performance of the global model is the primary objective of FL systems, under heterogeneous settings client models have received more attention. Here, we provide more insight into how the chosen approach for training custom client models affects the global model, which is essential for any FL application. We show that the global model can fully leverage the strength of KD with heterogeneous data. Driven by empirical observations, we further propose a new approach that combines KD and Learning without Forgetting (LwoF) to produce improved personalised models. We bring heterogeneous FL on par with the mighty FedAvg of homogeneous FL in realistic deployment scenarios with dropping clients.
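For contrast with the heterogeneous setting, the homogeneous FedAvg aggregation referenced above reduces to a dataset-size-weighted average of client parameters (a minimal sketch; the variable names are illustrative):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Aggregate client model parameters, weighting by local dataset size.

    This is the standard FedAvg server update; heterogeneous-model FL
    cannot average mismatched architectures directly and replaces this
    step with knowledge distillation into the global model.
    """
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

clients = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
sizes = [10, 30, 60]
global_model = fedavg(clients, sizes)
# per-coordinate weighted means: (1*10 + 3*30 + 5*60)/100 and (2*10 + 4*30 + 6*60)/100
```

Dropping clients, as in the deployment scenarios mentioned above, amounts to removing entries from both lists before aggregation, which changes the effective weighting round to round.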